Model, properties and imputation method of missing SNP genotype data utilizing mutual information
Authors
Abstract
Similar resources
Missing data imputation in multivariable time series data
Multivariate time series data are found in a variety of fields such as bioinformatics, biology, genetics, astronomy, geography and finance. Many time series datasets contain missing data. Multivariate time series missing data imputation is a challenging topic and needs to be carefully considered before learning or predicting time series. Much research has been done on the use of diffe...
Posters IMPUTATION OF MISSING GENOTYPES IN HIGH DENSITY SNP DATA
The accuracy and computational complexity of five methods to impute missing genotypes in high density SNP data were investigated. The haplotype reconstruction package fastPHASE reached the highest accuracies (91% to 98%) for varying proportions (0.2% to 8%) of missing genotypes. Alternative methods based on principal component analysis were less accurate (67% to 94%), but their computational dem...
Utilizing Genotype Imputation for the Augmentation of Sequence Data
BACKGROUND: In recent years, capabilities for genotyping large sets of single nucleotide polymorphisms (SNPs) have increased considerably, with the ability to genotype over 1 million SNP markers across the genome. This advancement in technology has led to an increase in the number of genome-wide association studies (GWAS) for various complex traits. These GWAS have resulted in the implication of o...
K nearest neighbours with mutual information for simultaneous classification and missing data imputation
Missing data is a common drawback in many real-life pattern classification scenarios. One of the most popular solutions is missing data imputation by the K nearest neighbours (KNN) algorithm. In this article, we propose a novel KNN imputation procedure using a feature-weighted distance metric based on mutual information (MI). This method provides a missing data estimation aimed at solving the c...
Improvement of missing genotype imputation through bi-directional parsing of large SNP panels Christine Sinoquet
Difficult analyses such as disease association studies, which aim at mapping genetic variants underlying complex human diseases, rely on high-throughput genotyping techniques. However, a shortcoming of these techniques is the generation of missing calls. Computational inference of missing data represents a challenging alternative to re-genotyping the missing regions. In this paper, we prese...
Journal
Journal title: Journal of Computational and Applied Mathematics
Year: 2009
ISSN: 0377-0427
DOI: 10.1016/j.cam.2008.10.020